Structural Features Extraction for Devnagari and Bangla Language Documents

Similar resources

Extraction of text-related features for condensing image documents

A system has been built that selects excerpts from a scanned document for presentation as a summary, without using character recognition. The method relies on the idea that the most significant sentences in a document contain words that are both specific to the document and have a relatively high frequency of occurrence within it. Accordingly, and entirely within the image domain, each page ima...

Definition Extraction using Linguistic and Structural Features

In this paper, a combination of linguistic and structural information is used for the extraction of Dutch definitions. The corpus used is a collection of Dutch texts on computing and e-learning containing 603 definitions. The extraction process consists of two steps. In the first step, a parser using a grammar defined on the basis of the patterns observed in the definitions is applied to the compl...

Specification of UNL Deconverter for Bangla Language

At present, the WWW represents a powerful tool for communication and information interchange. With a simple mechanism, it is possible to access innumerable documents on a huge variety of topics from anywhere in the world. However, despite the abundance of information, languages very often cause severe problems. When most of the web pages today are written in a few of the most common languages like...

Structural Features of Chinese Language

The Chinese language differs from many Western languages in several structural features. It is not alphabetic. A large number of Chinese characters are ideographic symbols. The monosyllabic structure, the open-vocabulary nature, the flexible wording structure with tones, and the flexibility in word ordering are good examples of the structural features of the Chinese language. It is believed ...

Zone-based Keyword Spotting in Bangla and Devanagari Documents

In this paper, we present a word spotting system in text lines for offline Indic scripts such as Bangla (Bengali) and Devanagari. Recently, it was shown that a zone-wise recognition method improves word recognition performance over conventional full-word recognition systems in Indic scripts [29]. Inspired by this idea, we consider the zone segmentation approach and use middle zone information ...

Journal

Journal title: Indian Journal of Science and Technology

Year: 2015

ISSN: 0974-5645,0974-6846

DOI: 10.17485/ijst/2015/v8i13/56453